gestural score
- North America > Canada > Quebec > Montreal (0.04)
- Europe > Netherlands > Gelderland > Nijmegen (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (6 more...)
- Information Technology (0.92)
- Health & Medicine > Therapeutic Area > Neurology (0.67)
SSDM: Scalable Speech Dysfluency Modeling
Lian, Jiachen, Zhou, Xuanru, Ezzes, Zoe, Vonk, Jet, Morin, Brittany, Baquirin, David, Mille, Zachary, Tempini, Maria Luisa Gorno, Anumanchipalli, Gopala
Speech dysfluency modeling is the core module for spoken language learning and speech therapy. However, there are three challenges. First, current state-of-the-art solutions suffer from poor scalability. Second, there is a lack of a large-scale dysfluency corpus. Third, there is no effective learning framework. In this paper, we propose SSDM: Scalable Speech Dysfluency Modeling, which (1) adopts articulatory gestures as scalable forced alignment; (2) introduces a connectionist subsequence aligner (CSA) to achieve dysfluency alignment; (3) introduces a large-scale simulated dysfluency corpus called Libri-Dys; and (4) develops an end-to-end system by leveraging the power of large language models (LLMs). We expect SSDM to serve as a standard in the area of dysfluency modeling. A demo is available at https://eureka235.github.io.
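To make "dysfluency alignment" concrete, here is a minimal sketch that aligns a reference phoneme sequence to a spoken one as a subsequence, using a classical longest-common-subsequence (LCS) dynamic program. This is not the paper's connectionist subsequence aligner, which the abstract does not specify; the function names and the phoneme labels are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def lcs_align(reference: List[str], spoken: List[str]) -> List[Tuple[int, int]]:
    """Return (i, j) pairs where reference[i] is matched to spoken[j].

    Illustrative stand-in only: a plain LCS alignment, not the CSA
    described in the abstract.
    """
    n, m = len(reference), len(spoken)
    # dp[i][j] = LCS length of reference[i:] and spoken[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if reference[i] == spoken[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])

    # Walk forward through the table to recover one optimal alignment.
    pairs: List[Tuple[int, int]] = []
    i = j = 0
    while i < n and j < m:
        if reference[i] == spoken[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1  # skip a reference phoneme (possible deletion)
        else:
            j += 1  # skip a spoken phoneme (possible insertion or repetition)
    return pairs


def unmatched_spoken(reference: List[str], spoken: List[str]) -> List[int]:
    """Indices of spoken phonemes left unaligned: candidate dysfluent positions."""
    matched = {j for _, j in lcs_align(reference, spoken)}
    return [j for j in range(len(spoken)) if j not in matched]


# Hypothetical example: "please" spoken with a repeated initial phoneme.
reference = ["P", "L", "IY", "Z"]
spoken = ["P", "P", "L", "IY", "Z"]
print(lcs_align(reference, spoken))
print(unmatched_spoken(reference, spoken))
```

In this toy case the extra "P" in the spoken sequence is the only unaligned position, which is the kind of span a dysfluency detector would flag. A learned aligner would operate on acoustic or articulatory features rather than exact symbol matches.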
- Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)